Flexible Acquisition of Verb Subcategorization Frames in Italian

Authors

  • Tommaso Caselli
  • Francesco Rubino
  • Francesca Frontini
  • Irene Russo
  • Valeria Quochi
Abstract

This paper describes a web-service system for the automatic acquisition of verb subcategorization frames (SCFs) from parsed Italian data. The system acquires SCFs in an unsupervised manner. We created two gold standards to evaluate the system: the first combines information from two lexica (one manually created, the other automatically acquired) with manual exploration of corpus data; the second consists of annotated data extracted from a specialized corpus in the environmental domain. Data filtering is performed by means of maximum likelihood estimation (MLE). In addition, we assign a confidence score to each extracted lexicon entry and evaluate the extractor on domain-specific data. The confidence score allows the end user to easily select lexicon entries according to their reliability.
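As an illustration of MLE-based filtering, the following minimal sketch estimates the relative frequency of each frame given its verb and discards frames that fall below a cutoff. It is not the authors' implementation: the function name `mle_filter`, the frame labels, the example verb counts, and the threshold value are all assumptions made for this example. The retained relative frequency is reused here as a stand-in for a confidence score, since the abstract does not say how the score is computed.

```python
from collections import Counter, defaultdict


def mle_filter(observations, threshold=0.05):
    """Keep, for each verb, the frames whose relative frequency meets the threshold.

    observations: iterable of (verb, frame) pairs, e.g. taken from parsed text.
    Returns {verb: {frame: relative_frequency}}.
    Hypothetical helper; the threshold value is an assumption.
    """
    counts = defaultdict(Counter)
    for verb, frame in observations:
        counts[verb][frame] += 1

    lexicon = {}
    for verb, frame_counts in counts.items():
        total = sum(frame_counts.values())
        # MLE of P(frame | verb) is the frame count divided by the verb count.
        lexicon[verb] = {
            frame: count / total
            for frame, count in frame_counts.items()
            if count / total >= threshold
        }
    return lexicon


# Toy data (invented): 8 transitive uses, 1 intransitive, 1 with a complement.
observations = (
    [("mangiare", "SUBJ#OBJ")] * 8
    + [("mangiare", "SUBJ")]
    + [("mangiare", "SUBJ#OBJ#COMP")]
)
lexicon = mle_filter(observations, threshold=0.2)
print(lexicon)
```

With a cutoff of 0.2, only the transitive frame survives for the toy verb; lowering the cutoff trades precision for recall, which is the choice a per-entry confidence score leaves to the end user.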

Similar resources

Bengali Verb Subcategorization Frame Acquisition - A Baseline Model

Acquisition of verb subcategorization frames is important as verbs generally take different types of relevant arguments associated with each phrase in a sentence in comparison to other parts of speech categories. This paper presents the acquisition of different subcategorization frames for a Bengali verb Kara (do). It generates compound verbs in Bengali when combined with various noun phrases. ...

Unsupervised Acquisition of Verb Subcategorization Frames from Shallow-Parsed Corpora

In this paper, we report experiments on the unsupervised automatic acquisition of Italian and English verb subcategorization frames (SCFs) from general and domain corpora. The proposed technique operates on syntactically shallow-parsed corpora on the basis of a limited number of search heuristics not relying on any previous lexico-syntactic knowledge about SCFs. Although preliminary, reported res...

The Automatic Acquisition Of Frequencies Of Verb Subcategorization Frames From Tagged Corpora

We describe a mechanism for automatically acquiring verb subcategorization frames and their frequencies in a large corpus. A tagged corpus is first partially parsed to identify noun phrases and then a linear grammar is used to estimate the appropriate subcategorization frame for each verb token in the corpus. In an experiment involving the identification of six fixed subcategorization frames, o...

Acquiring Verb Subcategorization Frames in Bengali from Corpora

Subcategorization frames acquisition of a phrase can be described as a mechanism to extract different types of relevant arguments that are associated with that phrase in a sentence. This paper presents the acquisition of different subcategory frames for a specific Bengali verb that has been identified from POS tagged and chunked data prepared from raw Bengali news corpus. Syntax plays the main ...

Learning Automatic Acquisition of Subcategorization Frames Using Bayesian Inference and Support Vector Machines

Learning Bayesian Belief Networks (BBN) from corpora and Support Vector Machines (SVM) have been applied to the automatic acquisition of verb subcategorization frames for Modern Greek. We are incorporating minimal linguistic resources, i.e. basic morphological tagging and phrase chunking, to demonstrate that verb subcategorization, which is of great significance for developing robust natural la...

Publication date: 2012